Quench Dynamics of Topological Maximally-Entangled States
We investigate the quench dynamics of the one-particle entanglement spectra (OPES) for systems with topologically nontrivial phases. Using dimerized chains as an example, it is demonstrated that the evolution of the OPES for the quenched bipartite systems is governed by an effective Hamiltonian characterized by a pseudo spin in a time-dependent pseudo magnetic field $\vec{S}(k,t)$. The existence and evolution of the topological maximally-entangled edge states are determined by the winding number of $\vec{S}(k,t)$ in $k$-space. In particular, the maximally-entangled edge states survive only if nontrivial Berry phases are induced by the winding of $\vec{S}(k,t)$. In the infinite-time limit the equilibrium OPES can be determined by an effective time-independent pseudo magnetic field $\vec{S}_{\mathrm{eff}}(k)$. Furthermore, when the maximally-entangled edge states are unstable, they are destroyed by quasiparticles within a characteristic timescale proportional to the system size.
Comment: 5 pages, 3 figures
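To make the pseudo-spin picture concrete, the following generic two-band form (notation assumed here, not quoted from the paper) relates the effective Hamiltonian to the winding number whose nonzero value signals the Berry phase protecting the edge states:

    \[
      H_{\mathrm{eff}}(k,t) = \vec{S}(k,t)\cdot\vec{\sigma}, \qquad
      w = \frac{1}{2\pi}\oint \mathrm{d}k\,\partial_k \phi(k,t), \qquad
      \phi(k,t) = \arctan\frac{S_y(k,t)}{S_x(k,t)},
    \]

where $\vec{\sigma}$ are the Pauli matrices; a nonzero $w$ corresponds to a nontrivial Berry (Zak) phase and hence to surviving maximally-entangled edge states.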
A Comparative Study on Regularization Strategies for Embedding-based Neural Networks
This paper aims to compare different regularization strategies to address a
common phenomenon, severe overfitting, in embedding-based neural networks for
NLP. We chose two widely studied neural models and tasks as our testbed. We
tried several frequently applied or newly proposed regularization strategies,
including penalizing weights (embeddings excluded), penalizing embeddings,
re-embedding words, and dropout. We also emphasized incremental hyperparameter tuning and combining different regularizations. The results provide a picture of hyperparameter tuning for neural NLP models.
Comment: EMNLP '1
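For concreteness, here is a minimal PyTorch sketch of three of the strategies named above: penalizing weights (embeddings excluded), penalizing embeddings, and dropout (re-embedding words is omitted). All sizes and coefficients are illustrative, not the paper's settings:

    import torch
    import torch.nn as nn

    # toy classifier over a single token: embedding -> MLP
    model = nn.Sequential(
        nn.Embedding(10000, 50),   # word embeddings
        nn.Flatten(),
        nn.Linear(50, 100),
        nn.ReLU(),
        nn.Dropout(p=0.5),         # strategy: dropout
        nn.Linear(100, 2),
    )

    def l2_penalties(model, lam_w=1e-4, lam_e=1e-4):
        # strategy: penalize weights, with the embedding layer excluded
        pen_w = sum((p ** 2).sum()
                    for name, p in model.named_parameters()
                    if name.endswith("weight") and not name.startswith("0."))
        # strategy: penalize the embeddings themselves, with its own strength
        pen_e = (model[0].weight ** 2).sum()
        return lam_w * pen_w + lam_e * pen_e

    x = torch.randint(0, 10000, (32, 1))   # batch of single-token inputs
    y = torch.randint(0, 2, (32,))
    loss = nn.functional.cross_entropy(model(x), y) + l2_penalties(model)
    loss.backward()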
Classifying Relations via Long Short Term Memory Networks along Shortest Dependency Path
Relation classification is an important research arena in the field of
natural language processing (NLP). In this paper, we present SDP-LSTM, a novel
neural network to classify the relation of two entities in a sentence. Our
neural architecture leverages the shortest dependency path (SDP) between two
entities; multichannel recurrent neural networks, with long short-term memory (LSTM) units, pick up heterogeneous information along the SDP. Our proposed model has several distinct features: (1) The shortest dependency paths retain the most relevant information for relation classification, while eliminating
irrelevant words in the sentence. (2) The multichannel LSTM networks allow
effective information integration from heterogeneous sources over the
dependency paths. (3) A customized dropout strategy regularizes the neural
network to alleviate overfitting. We test our model on the SemEval 2010
relation classification task, and achieve an $F_1$-score of 83.7\%, higher than competing methods in the literature.
Comment: EMNLP '1
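A minimal sketch of the multichannel idea, under assumptions: each channel (say, words and POS tags along the SDP) gets its own embedding table and LSTM, and the pooled channel states are concatenated for relation classification. Channels, dimensions, and pooling are illustrative rather than the paper's exact configuration:

    import torch
    import torch.nn as nn

    class SDPLSTMSketch(nn.Module):
        def __init__(self, vocab_sizes, emb_dim=50, hidden=100, n_relations=19):
            super().__init__()
            # one embedding + LSTM per information channel along the SDP
            self.embs = nn.ModuleList(nn.Embedding(v, emb_dim) for v in vocab_sizes)
            self.lstms = nn.ModuleList(
                nn.LSTM(emb_dim, hidden, batch_first=True) for _ in vocab_sizes)
            self.out = nn.Linear(hidden * len(vocab_sizes), n_relations)

        def forward(self, channels):
            # channels: list of (batch, path_len) index tensors, one per channel
            pooled = []
            for x, emb, lstm in zip(channels, self.embs, self.lstms):
                h, _ = lstm(emb(x))                 # (batch, path_len, hidden)
                pooled.append(h.max(dim=1).values)  # max-pool over the path
            return self.out(torch.cat(pooled, dim=-1))

    model = SDPLSTMSketch(vocab_sizes=[10000, 50])  # word and POS channels
    words = torch.randint(0, 10000, (4, 7))         # 4 paths of length 7
    pos = torch.randint(0, 50, (4, 7))
    logits = model([words, pos])                    # (4, n_relations) scores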
Building Program Vector Representations for Deep Learning
Deep learning has made significant breakthroughs in various fields of
artificial intelligence. Advantages of deep learning include the ability to
capture highly complicated features, weak involvement of human engineering,
etc. However, it is still virtually impossible to use deep learning to analyze
programs, since deep architectures cannot be trained effectively with pure backpropagation. In this pioneering paper, we propose the "coding criterion" to
build program vector representations, which are the premise of deep learning
for program analysis. Our representation learning approach directly makes deep
learning a reality in this new field. We evaluate the learned vector
representations both qualitatively and quantitatively. We conclude, based on the experiments, that the coding criterion is successful in building program
representations. To evaluate whether deep learning is beneficial for program
analysis, we feed the representations to deep neural networks, and achieve
higher accuracy in the program classification task than "shallow" methods, such
as logistic regression and the support vector machine. This result confirms the feasibility of using deep learning to analyze programs. It also gives preliminary evidence of its success in this new field. We believe deep learning will become an outstanding technique for program analysis in the near future.
Comment: This paper was submitted to ICSE'1
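One plausible reading of such a coding criterion, sketched under assumptions (the paper's formulation may weight each child position with its own matrix and coefficient): a parent AST node's vector should be reconstructable from its children's vectors, trained with a hinge margin against a corrupted negative sample:

    import torch
    import torch.nn as nn

    emb = nn.Embedding(200, 30)   # one vector per AST node type (sizes assumed)
    W = nn.Linear(30, 30)         # shared children-to-parent map (simplified)

    def reconstruction_error(parent_ids, child_ids):
        # distance between the parent vector and the encoding of its children
        pred = torch.tanh(W(emb(child_ids).mean(dim=1)))
        return ((emb(parent_ids) - pred) ** 2).sum(dim=-1)

    parent = torch.tensor([3, 7])
    children = torch.tensor([[5, 9, 2], [1, 4, 8]])
    corrupted = torch.tensor([[5, 0, 2], [1, 4, 0]])  # one symbol replaced

    # hinge loss: the true composition should beat the corrupted one by a margin
    d_pos = reconstruction_error(parent, children)
    d_neg = reconstruction_error(parent, corrupted)
    loss = torch.clamp(1.0 + d_pos - d_neg, min=0.0).mean()
    loss.backward()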
An Equal-Size Hard EM Algorithm for Diverse Dialogue Generation
Open-domain dialogue systems aim to interact with humans through natural
language texts in an open-ended fashion. Despite the recent success of super-large dialogue systems such as ChatGPT, using medium-to-small-sized dialogue systems remains the common practice, as they are more lightweight and
accessible; however, generating diverse dialogue responses is challenging,
especially with smaller models. In this work, we propose an Equal-size Hard
Expectation--Maximization (EqHard-EM) algorithm to train a multi-decoder model
for diverse dialogue generation. Our algorithm assigns a sample to a decoder in
a hard manner and additionally imposes an equal-assignment constraint to ensure
that all decoders are well-trained. We provide a detailed theoretical analysis to
justify our approach. Further, experiments on two large-scale open-domain
dialogue datasets verify that our EqHard-EM algorithm generates high-quality
diverse responses.
Comment: Accepted by ICLR 202
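To make the equal-assignment constraint concrete, here is an illustrative sketch (assumed details, not the authors' exact procedure) of an equal-size hard E-step: every sample goes to exactly one decoder, and each decoder receives the same number of samples; an M-step would then train each decoder only on its assigned samples:

    import numpy as np

    def equal_size_hard_assign(neg_log_liks):
        # neg_log_liks: (n_samples, n_decoders) per-decoder losses; greedily
        # send each sample to its cheapest decoder that still has capacity
        # n_samples // n_decoders (a simple stand-in for an exact
        # balanced-assignment solver; assumes n_samples divides evenly)
        n, k = neg_log_liks.shape
        capacity = np.full(k, n // k)
        assignment = np.full(n, -1)
        # handle samples with the strongest preference first
        for i in np.argsort(neg_log_liks.min(axis=1)):
            for d in np.argsort(neg_log_liks[i]):   # cheapest decoder first
                if capacity[d] > 0:
                    assignment[i] = d
                    capacity[d] -= 1
                    break
        return assignment

    losses = np.random.rand(8, 4)            # 8 samples, 4 decoders
    assign = equal_size_hard_assign(losses)
    print(np.bincount(assign, minlength=4))  # -> [2 2 2 2], all equal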